Abstract. In the practice of information extraction, the input data are usually 
arranged into pattern matrices, and analyzed by the methods of linear algebra 
and statistics, such as principal component analysis. In some applications, the 
tacit assumptions of these methods lead to wrong results. The usual reason is 
that the matrix composition of linear algebra presents information as flowing in 
waves, whereas it sometimes flows in particles, which seek the shortest paths. 
This wave-particle duality in computation and information processing was
originally observed by Abramsky. In this paper we pursue a particle view of
information, formalized in distance spaces, which generalize metric spaces, but
are slightly less general than Lawvere's generalized metric spaces. In this
framework, the task of extracting the 'principal components' from a given matrix of
data boils down to a bicompletion, in the sense of enriched category theory. We
describe the bicompletion construction for distance matrices. The practical goal 
that motivates this research is to develop a method to estimate the hardness of 
attack constructions in security. 

1 Introduction 

Dedication. When Samson Abramsky offered me the position of 'Human Capital Mo- 
bility Research Fellow' in his group at Imperial College back in 1993, I was an ex- 
programmer with postdoctoral experience in category theory. It was a questionable in- 
vestment. Category theoretical models of computation were, of course, already in use 
in theoretical computer science; but the emphasis was on the word 'theoretical'. A cou- 
ple of years later, I left academia to build software using categorical models. While it 
is clear and well understood that Samson's work and results consolidated and enriched 
categorical methods of theoretical computer science, their applications in the practice 
of computation may not be as well known. In the long run, I believe, the impact of 
the methods and of the approach that we learned from Samson will become increas- 
ingly clear, as the abstract structures that we use, including the fully abstract ones, are 
becoming more concrete, more practical, and more often indispensable. 

In the present paper, I venture into an extended exercise in enriched category theory, 
directly motivated by concrete problems of security [17, 16] and of data analysis [18].
Although the story is not directly related to Samson's own work, I hope that it is ap- 
propriate for the occasion, since he is the originator of the general spirit of categorical 
variations on computational themes, even if I can never hope to approach his balance 
and style. 



Motivation: Distances between algorithms 

Suppose that you are given an algorithm a, and you need to construct another algorithm 
b, such that some predicate P(a, b) is satisfied. Or more concretely, suppose that a is 
a software system, and b should be an attack on a, contradicting a's security claim by 
realizing a property P(a, b). Since reverse engineering is easy 02151 , we can assume that 
the code of a is readily available, and your task is thus to code the attack b. Note that 
a is in principle an algorithmic pattern, that can be implemented in many ways, and 
may have many versions and instances. So your attack b should also be an algorithmic 
pattern, related to a by some polymorphic transformation. The derivation of b from a 
should thus be polymorphic, i.e. a uniform construction: it should be a program p that 
inputs a description of a and outputs a corresponding description p(a) = b. How hard
is it to find p? An approach to answering such questions is suggested in algorithmic
information theory 11251 1 31 . The notion of Kolmogorov complexity is that the distance 
from an algorithm a to an algorithm b can be measured by the length of the shortest 
programs that construct b from a, i.e. 

$$ d(a,b) = \bigwedge_{p(a)=b} |p| \tag{1} $$

where $|p|$ denotes the length of the program p. It is easy to see that the above formula
yields the triangle law $d(a,b) + d(b,c) \overset{+}{\geq} d(a,c)$, where the superscript '+' means that
the order relation $\geq$ is taken uniformly up to a constant, which is in this case the length
of the program composition operation, needed to get a program to construct c from a
by composing a program that constructs c from b with a program that constructs b from
a. Algorithmic information theory always works with such order relations 013141 . The 
equation d(a,a) = 0 holds in the same sense, up to the constant length of the shortest
identity program, that just inputs and outputs identical data. This distance of algorithms,
in the style of Kolmogorov complexity, was proposed in [16] as a tool to measure how
hard it is to construct an attack on a given system. The point was that a system could 
be effectively secure even when some attacks on it exist, provided that these attacks are 
provably hard to construct. The goal of the present note is to spell out some general 
results about distance that turn out to be needed for this particular application. 
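
To make formula (1) concrete, here is a toy sketch. It does not compute Kolmogorov complexity, which is uncomputable. Instead it assumes a hypothetical two-instruction language (`inc` and `dbl`, acting on natural numbers) and finds the length of the shortest instruction sequence that turns a into b by breadth-first search. The names `OPS` and `program_distance` are illustrative assumptions and do not come from the paper.

```python
from collections import deque

# Hypothetical toy instruction set: a "program" is a finite sequence of these steps.
OPS = {"inc": lambda n: n + 1, "dbl": lambda n: 2 * n}


def program_distance(a, b):
    """Length of the shortest sequence of OPS turning a into b (inf if none exists)."""
    if a == b:
        return 0  # the empty program plays the role of the identity
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        n, steps = queue.popleft()
        for op in OPS.values():
            m = op(n)
            if m == b:
                return steps + 1
            # Both instructions are non-decreasing on naturals,
            # so any value above b can never come back down to b.
            if m <= b and m not in seen:
                seen.add(m)
                queue.append((m, steps + 1))
    return float("inf")
```

In this toy model the distance is asymmetric: `program_distance(1, 8)` is finite, while `program_distance(8, 1)` is infinite because no instruction decreases a number. The triangle law holds exactly here, since composing two programs is plain concatenation and costs nothing extra.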

But why do we need general results about distances to answer the concrete ques- 
tion about the hardness of constructing attack programs from system programs? The 
reason is that the task of finding an attack algorithm not too far from a system
algorithm naturally leads to the task of constructing a completion of the space around the
system algorithm. The attacker sees the system, and may be familiar with some other 
algorithms in its neighborhood; but it is not known whether an attack exists, and how 
far it is. The task of discovering the attack is the task of completing the space around the 
system. And the construction of a completion is easier in general, than in some concrete 
cases. 

How does a real attacker search for an algorithm p to derive an attack b from the
system a? He is not trying to guess the construction in isolation, but in the context of
his algorithmic knowledge. This knowledge has at least two components. On one hand,
there is some algorithmic knowledge A about the software systems $a_0, a_1, a_2, \ldots$, and a
distance measure $d_A : A \times A \to [0,\infty]$ between them, which expresses how they are related
to each other. On the other hand, there is some algorithmic knowledge B about the
attacks $b_0, b_1, b_2, \ldots$, and their distances $d_B : B \times B \to [0,\infty]$. Last but not least, there is some
knowledge about which attacks are related to which systems. This knowledge is expressed as
a distance matrix $A \times B \to [0,\infty]$, where shorter distances suggest easier attacks. In
order to determine whether there are any attacks in the proximity of a given system
a, our task is to conjoin the distance space A of systems with the distance space B
of attacks consistently with the distance matrix $A \times B \to [0,\infty]$ where the observed
connections between the systems and attacks are recorded. In this conjoined space, we
need to find the unknown attacks close to the target system. We find them by completing
the space of the known attacks. But since the completion is in general an infinite object,
we first study it abstractly, to determine how to construct just the parts of interest.

Related work. The completions that we study are based on Lawvere's view of metric
spaces as enriched categories [10]. Lawvere's generalized metric spaces were extensively
used in denotational semantics of programming languages [22, 3, 9], and recently
in ecology [12], following a renewed mathematical interest in the enriched category
approach [11]. In my own work, closely related results arose in the framework of
information extraction and concept analysis [18]. That work was, however, not based on
distance spaces as categories enriched in the additive monoid $[0,\infty]$, but on proximity
spaces, or proxets, as categories enriched in the multiplicative monoid $[0,1]$. Proxets
are a more natural framework for concept analysis, because they generalize posets, as
categories enriched over the multiplicative monoid $\{0,1\}$, and the existing theory and
intuitions are largely based on posets. Distance spaces, on the other hand, appear to be
a more convenient framework for relating algorithms.

Outline of the paper. In Sec. 2 we define distance spaces and describe some examples.
In Sec. 3 we spell out the notions of limit in distance spaces, the basic completion
constructions, and the adjunctions as they arise from the limit preserving morphisms.
In Sec. 4 we introduce distance matrices, and describe their decomposition. In Sec. 5
we put the previously presented components together to construct the bicompletions of
distance matrices. Sec. 6 provides a summary of the obtained results and a discussion
of future work.

2 Distance spaces 

2.1 Definition and background 

Definition 2.1. A distance space is a set A with a metric $d_A : A \times A \to [0,\infty]$ which is

- reflexive: $d(x,x) = 0$,

- transitive: $d(x,y) + d(y,z) \geq d(x,z)$, and

- antisymmetric: $d(x,y) = 0 = d(y,x) \implies x = y$.

A contraction between the distance spaces A and B is a function $f : A \to B$ such
that $d_A(x,y) \geq d_B(fx, fy)$ holds for all $x, y \in A$. The category of distance spaces and
contractions is denoted Dist.
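
For finite examples, the axioms of a distance space and the contraction condition can be checked by brute force. The following is a minimal sketch under that finiteness assumption. The helper names `is_distance_space` and `is_contraction` are illustrative and not part of the paper.

```python
from itertools import product


def is_distance_space(points, d):
    """Check reflexivity, transitivity and antisymmetry of d on a finite set."""
    reflexive = all(d(x, x) == 0 for x in points)
    transitive = all(
        d(x, y) + d(y, z) >= d(x, z) for x, y, z in product(points, repeat=3)
    )
    # Antisymmetry: d(x, y) = 0 = d(y, x) forces x = y.
    antisymmetric = all(
        x == y or d(x, y) != 0 or d(y, x) != 0
        for x, y in product(points, repeat=2)
    )
    return reflexive and transitive and antisymmetric


def is_contraction(f, points, d_source, d_target):
    """Check d_source(x, y) >= d_target(f(x), f(y)) for all pairs of points."""
    return all(
        d_source(x, y) >= d_target(f(x), f(y))
        for x, y in product(points, repeat=2)
    )
```

For instance, with `d = lambda x, y: max(y - x, 0)` on the points 0, 1, 2, halving (`x // 2`) is a contraction, while doubling is not.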



Background. In topology, distance spaces have been studied since the 1930s under the
name quasi-metric spaces [23, 8]. The prefix 'quasi' refers to the fact that the metric
symmetry law $d(x,y) = d(y,x)$ is not necessarily satisfied. When the antisymmetry law
is not satisfied either, the topologists speak of pseudo-quasi-metric spaces [24].
Lawvere [10] observed that pseudo-quasi-metric spaces, which he called generalized
metric spaces, could be viewed as enriched categories [7]. They are enriched over the
additive monoid $[0,\infty]$, viewed as a monoidal category with a unique arrow $x \to y$ if
and only if $x \geq y$. The distance $d(x,y) \in [0,\infty]$ is thus viewed as the 'hom-set' in the
enriched sense. Lawvere's main result was the characterization of the Cauchy completion
of a metric space as an enriched category construction. This view of distances and
contractions turned out to provide an alternative to domains for denotational semantics
[22], and their categorical completions were elaborated in [3, 9]. Distance spaces as
defined in Def. 2.1 are a special case of generalized metric spaces, since they are required to
satisfy the antisymmetry law. This is mainly a matter of convenience, as the following
lemma shows.

Lemma 2.2. A map $d_A : A \times A \to [0,\infty]$ which is reflexive and transitive in the sense of
Def. 2.1 is also antisymmetric if and only if it satisfies either of the following equivalent
conditions

- $(\forall z.\ d(z,x) = d(z,y)) \implies x = y$

- $(\forall z.\ d(x,z) = d(y,z)) \implies x = y$

Proof. In the presence of transitivity and reflexivity, $d(x,y) = 0$ holds if and only if
$\forall z.\ d(z,x) \geq d(z,y)$, or equivalently if and only if $\forall z.\ d(x,z) \leq d(y,z)$. The result
follows. □

Corollary 2.3. Distance spaces are just the skeletal generalized metric spaces. 
2.2 Examples 

The first example of a distance space is, of course, the interval $[0,\infty]$ itself, with the
metric

$$ d_{[0,\infty]}(x,y) = x \multimap y = \begin{cases} y - x & \text{if } x < y \\ 0 & \text{otherwise} \end{cases} \tag{2} $$

The $\multimap$ notation is convenient because the operation $d_{[0,\infty]} = \multimap : [0,\infty] \times [0,\infty] \to [0,\infty]$
makes $[0,\infty]$ into a closed category:

$$ x + y \geq z \iff x \geq y \multimap z \tag{3} $$

Any metric space is obviously an example of a distance space. But in distance spaces,
the distance d(a, b) from a to b does not have to be the same as the distance d(b, a)
from b to a. E.g., a may be on a hill, and b in the valley, and traveling one way may be
easier than traveling the other way. For our purposes described in the Introduction, this
distinction is quite important, since a program constructing an attack b from a system
code a does not have to be related in any obvious way to the program performing the
construction the other way.
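
The closed structure (3) can be tested numerically. The sketch below assumes that the operation written with the lollipop symbol is truncated subtraction, which is what condition (3) forces, and that the value at the pair of infinities is 0. The function name `lollipop` is a label chosen for this example.

```python
INF = float("inf")


def lollipop(x, y):
    """Truncated subtraction on [0, inf]: the distance from x to y."""
    if y <= x:
        return 0.0  # also covers x = y = inf, where y - x would be undefined
    return y - x  # here y > x, so x is finite and the difference is well defined


def closed_structure_holds(x, y, z):
    """Condition (3): x + y >= z if and only if x >= (y lollipop z)."""
    return (x + y >= z) == (x >= lollipop(y, z))
```

Running `closed_structure_holds` over a grid of values that includes `INF` confirms the equivalence on that grid, including the boundary cases where one argument is infinite.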

For a non-metric family of distance spaces, take any poset $(S, \sqsubseteq_S)$ and define a
distance space $(WS, d_{WS})$ by setting $d_{WS}(x,y) = 0$ if $x \sqsubseteq_S y$, otherwise $\infty$. The other way
around, any distance space A induces two posets, $\Upsilon A$ and $\Lambda A$, with the same underlying
set and

$$ x \sqsubseteq_{\Upsilon A} y \iff d_A(x,y) = 0 \qquad\qquad x \sqsubseteq_{\Lambda A} y \iff d_A(x,y) < \infty $$

The constructions $W$, $\Upsilon$ and $\Lambda$ form the adjunctions $\Lambda \dashv W \dashv \Upsilon$ : Dist $\to$ Pos. Since
$W$ : Pos $\hookrightarrow$ Dist is an embedding, Pos is thus a reflective and coreflective subcategory
of Dist.

Distance spaces are thus a common generalization of posets and metric spaces. For
an example not arising from posets or metric spaces, take any family of sets $\mathcal{X} \subseteq \wp X$,
and define

$$ d(x,y) = |y \setminus x| \tag{4} $$

The distance from x to y is thus the number of elements of y that are not in x. If X is
a set of terms, say in a dictionary, and $\mathcal{X}$ is a set of documents, each viewed as a set
of terms, then the distance from a document x to a document y is the number of terms that occur
in y and not in x. In natural language processing, documents are
usually presented as multisets (bags) of terms, and the distance is defined in terms of
multiset subtraction, which generalizes the set difference used in (4). In any case, it is
clear that the asymmetry of the notion of distance is as essential for such applications
as it is for the one described in the Introduction.
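
The set-difference distance on documents is easy to try out. The sketch below treats each document as a Python set of terms. The function name `doc_distance` and the sample documents are made up for illustration.

```python
def doc_distance(x, y):
    """Number of terms of y that do not occur in x (set-difference distance)."""
    return len(set(y) - set(x))


# Hypothetical sample documents, each viewed as a set of terms.
short_doc = {"metric", "space"}
long_doc = {"metric", "space", "category"}

# Going from the short document to the long one requires one new term,
# while the long document already contains every term of the short one.
forward = doc_distance(short_doc, long_doc)
backward = doc_distance(long_doc, short_doc)
```

The two directions give different values, which is the asymmetry the text refers to: a document that contains another is at distance 0 from it, but not the other way around.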



2.3 Basic constructions 

Given two distance spaces A and B, we define:

- the dual $A^o$: take the same underlying set and define the dual metric to be
  $d_{A^o}(x,y) = d_A(y,x)$;

- the product $A \times B$: take the cartesian product of the underlying sets and set the product
  metric to be $d_{A \times B}(\langle x,u \rangle, \langle y,v \rangle) = d_A(x,y) \vee d_B(u,v)$;

- the power $B^A$: take the set of contractions Dist(A, B) to be the underlying set and
  set the metric to be $d_{B^A}(f,g) = \bigvee_{x \in A} d_B(fx, gx)$.

These constructions induce the natural correspondences

$$ \mathrm{Dist}(A,B) \times \mathrm{Dist}(A,C) \cong \mathrm{Dist}(A, B \times C) \qquad\text{and}\qquad \mathrm{Dist}(A \times B, C) \cong \mathrm{Dist}(A, C^B) $$

Terminology. Contractions $f : A \to B$ are called covariant, whereas contractions
$f : A^o \to B$ are contravariant.



3 Sequences and their limits 



3.1 Left and right sequences 

Intuitively, to complete a metric space means to add enough points so that every suitably
convergent sequence has a limit. But usually many different sequences have the same
limit. The main problem of the standard theory of completions is to recognize such
sequences. The categorical approach overcomes this problem by considering canonical
sequences. Instead of the sequences $s, t : \mathbb{N} \to A$ such that $\lim_{i\to\infty} s_i = \psi = \lim_{i\to\infty} t_i$,
we consider a canonical sequence $\psi : A \to [0,\infty]$ where $\psi x$ intuitively denotes the
distance from $\psi$ to x.

Definition 3.1. In a distance space A, a (canonical) sequence is defined to be a
contraction into $[0,\infty]$. More precisely, we define that

- a left sequence is a covariant contraction $\lambda : A \to [0,\infty]$
  - we write its value at $x \in A$ as $\lambda x$

- a right sequence is a contravariant contraction $\varrho : A^o \to [0,\infty]$
  - we write its value at $x \in A$ as $x\varrho$.

Each of the sets of sequences

$$ \overleftarrow{A} = \left([0,\infty]^A\right)^o \qquad\text{and}\qquad \overrightarrow{A} = [0,\infty]^{A^o} $$

forms a distance space, with the metrics

$$ d_{\overleftarrow{A}}(\lambda, \theta) = \bigvee_{x \in A} \theta x \multimap \lambda x \qquad\text{and}\qquad d_{\overrightarrow{A}}(\varrho, \pi) = \bigvee_{x \in A} x\varrho \multimap x\pi $$

Remarks. The conditions $d_A(x,y) \geq \lambda x \multimap \lambda y$ and $d_A(x,y) \geq y\varrho \multimap x\varrho$, which
say that $\lambda$ and $\varrho$ are left and right contractions respectively, are by (3) respectively
equivalent to

$$ \lambda x + d(x,y) \geq \lambda y \qquad\qquad d(x,y) + y\varrho \geq x\varrho $$

3.2 Limits 

Definition 3.2. An element u of a distance space A is an upper bound of a right
sequence $\varrho$ in A if for all $x \in A$ holds

$$ x\varrho \geq d_A(x,u) \tag{5} $$

An element $\ell$ of a distance space A is a lower bound of a left sequence $\lambda$ in A if for
all $y \in A$ holds

$$ \lambda y \geq d_A(\ell, y) \tag{6} $$

Proposition 3.3. An element $u \in A$ is an upper bound of $\varrho$ and $\ell \in A$ is a lower bound of
$\lambda$ if and only if the following conditions hold for all $x, y \in A$

$$ d_A(u,y) \geq \bigvee_{x \in A} x\varrho \multimap d_A(x,y) \tag{7} $$

$$ d_A(x,\ell) \geq \bigvee_{y \in A} \lambda y \multimap d_A(x,y) \tag{8} $$

Proof. Condition (3) implies that (7) and (8) are respectively equivalent with

$$ x\varrho + d_A(u,y) \geq d_A(x,y) \tag{9} $$

$$ d_A(x,\ell) + \lambda y \geq d_A(x,y) \tag{10} $$

The claim follows by instantiating y to u in (7) and x to $\ell$ in (8). □

Definition 3.4. The supremum $\coprod \varrho$ of the right sequence $\varrho$ and the infimum $\prod \lambda$ of
the left sequence $\lambda$ are the elements of A that satisfy for every $x, y \in A$

$$ d_A\left(\coprod \varrho, y\right) = \bigvee_{x \in A} x\varrho \multimap d_A(x,y) \tag{11} $$

$$ d_A\left(x, \prod \lambda\right) = \bigvee_{y \in A} \lambda y \multimap d_A(x,y) \tag{12} $$

Suprema and infima constitute the limits of a distance space.

The distance space A is right (resp. left) complete if every right (resp. left) sequence
has a limit. The suprema and the infima thus yield the operations

$$ \coprod : \overrightarrow{A} \to A \qquad\text{and}\qquad \prod : \overleftarrow{A} \to A $$

One apparent shortcoming of treating sequences categorically, i.e. saturating them
to canonical sequences, is that it is not obvious how to define continuity, i.e. how to
distinguish the contractions which preserve suprema or infima. Clearly, a left continuous
contraction $f : A \to B$ should map the infimum of a left sequence $\lambda$ in A into the
infimum of the f-image of $\lambda$ in B. But what is the f-image of $\lambda : A \to [0,\infty]$ in B?
This question calls for a slight generalization of the concepts of sequence and limit.

3.3 Weighted limits

Limits are a special case of weighted limits, which are studied in general enriched
categories [7, Ch. 3]. We just sketch the theory of weighted limits in distance spaces.

Definition 3.5. For distance spaces A and K we define

- left diagrams as pairs of contractions $\langle k : K \to A,\ \lambda : K \to [0,\infty] \rangle$

- right diagrams as pairs of contractions $\langle k : K \to A,\ \varrho : K^o \to [0,\infty] \rangle$

Terminology and notation. The component $k : K \to A$ of a diagram is called its
shape. Using the angular brackets to denote the functions into cartesian products, we
also write

- $\langle k, \lambda \rangle : K \to A \times [0,\infty]$ for $\langle k : K \to A,\ \lambda : K \to [0,\infty] \rangle$

- $\langle k, \varrho^o \rangle : K \to A \times [0,\infty]^o$ for $\langle k : K \to A,\ \varrho : K^o \to [0,\infty] \rangle$

Definition 3.6. The weighted supremum $\coprod_\varrho k$ of the right diagram $\langle k, \varrho^o \rangle : K \to
A \times [0,\infty]^o$ and the weighted infimum $\prod_\lambda k$ of the left diagram $\langle k, \lambda \rangle : K \to A \times [0,\infty]$
are the elements of A that satisfy for every $x, y \in A$

$$ d_A\left(\coprod_\varrho k,\ y\right) = \bigvee_{x \in K} x\varrho \multimap d_A(kx, y) \tag{13} $$

$$ d_A\left(x,\ \prod_\lambda k\right) = \bigvee_{y \in K} \lambda y \multimap d_A(x, ky) \tag{14} $$

Remarks. Limits arise as a special case of weighted limits, by viewing sequences as
diagrams of shape $k = \mathrm{id} : A \to A$. A contraction $f : A \to B$ thus maps, say, a
left sequence $\langle \mathrm{id}, \lambda \rangle : A \to A \times [0,\infty]$ to the diagram $\langle f, \lambda \rangle : A \to B \times [0,\infty]$ in
B. More generally, it maps a left diagram $\langle k, \lambda \rangle : K \to A \times [0,\infty]$ to the diagram
$\langle f \circ k, \lambda \rangle : K \to B \times [0,\infty]$ in B. It is thus clear and easy to state what it means that a
contraction preserves a weighted limit.

Definition 3.7. A contraction $f : A \to B$ preserves

- weighted suprema if $f\left(\coprod_\varrho k\right) = \coprod_\varrho (f \circ k)$, and

- weighted infima if $f\left(\prod_\lambda k\right) = \prod_\lambda (f \circ k)$.

On the other hand, although convenient to work with, weighted limits of diagrams in 
distance spaces also boil down to the limits of suitable sequences. We just state this fact, 
since it simplifies the construction of the completions; but leave the proof for another 
paper, since the proof construction is not essential for the goal of the present paper. 

Proposition 3.8. A distance space has 

- the weighted suprema of all right diagrams if and only if it has the suprema of all 
right sequences; 

- the weighted infima of all left diagrams if and only if it has the infima of all left 
sequences. 

3.4 Completions 

Every element a of a distance space A induces two representable sequences

$$ \Delta a : A \to [0,\infty],\quad x \mapsto d_A(a,x) \qquad\qquad \nabla a : A^o \to [0,\infty],\quad x \mapsto d_A(x,a) $$

These induced contractions $\Delta : A \to \overleftarrow{A}$ and $\nabla : A \to \overrightarrow{A}$ correspond to the
Yoneda-Cayley embeddings [13, Sec. III.2]. They make $\overleftarrow{A}$ into the lower completion, and $\overrightarrow{A}$
into the upper completion of the distance space A.

Proposition 3.9. $\overleftarrow{A}$ is left complete and $\overrightarrow{A}$ is right complete. Each of them is universal
among distance spaces with the corresponding completeness properties, in the sense
that

- any monotone $f : A \to C$ into a complete distance space C induces a unique
  $\prod$-preserving morphism $f_\# : \overleftarrow{A} \to C$ such that $f = f_\# \circ \Delta$;

- any monotone $g : A \to D$ into a cocomplete distance space D induces a unique
  $\coprod$-preserving morphism $g^\# : \overrightarrow{A} \to D$ such that $g = g^\# \circ \nabla$.

These constructions have been thoroughly analyzed in [3, 9]. Here we just state
the basic facts that justify our notations, and substantiate the further developments.

Proposition 3.10. ("The Yoneda Lemma") For every $\varrho \in \overrightarrow{A}$ and $\lambda \in \overleftarrow{A}$ holds

$$ a\varrho = \bigvee_{x \in A} x(\nabla a) \multimap x\varrho = d_{\overrightarrow{A}}(\nabla a, \varrho) $$

$$ \lambda b = \bigvee_{x \in A} (\Delta b)x \multimap \lambda x = d_{\overleftarrow{A}}(\lambda, \Delta b) $$

Instantiating in the preceding proposition $\lambda$ to $\Delta a$ and $\varrho$ to $\nabla b$ yields

Corollary 3.11. The embeddings $\Delta : A \to \overleftarrow{A}$ and $\nabla : A \to \overrightarrow{A}$ are isometries

$$ d_A(a,b) = d_{\overrightarrow{A}}(\nabla a, \nabla b) = d_{\overleftarrow{A}}(\Delta a, \Delta b) $$
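
The isometry claim of Corollary 3.11 can be checked on a small example. The sketch below assumes the metrics on left and right sequences are the suprema of truncated differences described in Sec. 3.1 (with the left one taken in the dual order), and uses a made-up three-point asymmetric space where moving up costs 2 per step and moving down costs 1. All names are illustrative.

```python
def lollipop(x, y):
    """Truncated subtraction: max(y - x, 0)."""
    return 0 if y <= x else y - x


A = [0, 1, 2]


def d(x, y):
    """Asymmetric toy distance: uphill costs 2 per step, downhill costs 1."""
    return 2 * (y - x) if y >= x else x - y


def nabla(a):
    """Right representable sequence: x maps to d(x, a)."""
    return {x: d(x, a) for x in A}


def delta(a):
    """Left representable sequence: x maps to d(a, x)."""
    return {x: d(a, x) for x in A}


def d_right(rho, pi):
    """Distance between right sequences: sup over x of (x rho) lollipop (x pi)."""
    return max(lollipop(rho[x], pi[x]) for x in A)


def d_left(lam, theta):
    """Distance between left sequences: sup over x of (theta x) lollipop (lam x)."""
    return max(lollipop(theta[x], lam[x]) for x in A)
```

On this example, `d_right(nabla(a), nabla(b))` and `d_left(delta(a), delta(b))` both return `d(a, b)` for every pair of points, which is what the corollary states.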



3.5 Adjunctions 

Notation. In any distance space A, it is often convenient to abbreviate $d_A(x,y) = 0$ to
$x \rightsquigarrow y$. For $f, g : A \to B$, it is easy to see that $f \rightsquigarrow g$ if and only if $fx \rightsquigarrow gx$ for all
$x \in A$.

Proposition 3.12. For any contraction $f : A \to B$ holds

(a) $\iff$ (b) $\iff$ (c) and (d) $\iff$ (e) $\iff$ (f)

where

(a) $f\left(\coprod \varrho\right) = \coprod f(\varrho)$

(b) $\exists f_* : B \to A\ \forall x \in A\ \forall y \in B.\ d_B(fx, y) = d_A(x, f_* y)$

(c) $\exists f_* : B \to A.\ \mathrm{id}_A \rightsquigarrow f_* f \ \wedge\ f f_* \rightsquigarrow \mathrm{id}_B$

(d) $f\left(\prod \lambda\right) = \prod f(\lambda)$

(e) $\exists f^* : B \to A\ \forall x \in A\ \forall y \in B.\ d_A(f^* y, x) = d_B(y, fx)$

(f) $\exists f^* : B \to A.\ f^* f \rightsquigarrow \mathrm{id}_A \ \wedge\ \mathrm{id}_B \rightsquigarrow f f^*$

Each of the morphisms $f^*$ and $f_*$ is uniquely determined by f, whenever they exist.

Definition 3.13. A right adjoint is a contraction satisfying (a-c) of Prop. 3.12; a left
adjoint satisfies (d-f). A (distance) adjunction between the distance spaces A and B is a
pair of contractions $f^* : A \rightleftarrows B : f_*$ related as in (b-c) and (e-f).

Equations (11) and (12) immediately yield the following fact.

Proposition 3.14. Limits are adjoints to the Yoneda-Cayley embeddings:

$$ d_A\left(\coprod \varrho, y\right) = d_{\overrightarrow{A}}(\varrho, \nabla y) \qquad\qquad d_A\left(x, \prod \lambda\right) = d_{\overleftarrow{A}}(\Delta x, \lambda) $$

Putting Propositions 3.12 and 3.14 together yields yet another familiar fact.

Proposition 3.15. The sup-completion $\nabla : A \to \overrightarrow{A}$ preserves any infima that exist in A.
The inf-completion $\Delta : A \to \overleftarrow{A}$ preserves any suprema that exist in A.

3.6 Projectors and nuclei 

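Proposition 3.16. For any adjunction $f^* : A \rightleftarrows B : f_*$ holds
(a) $\iff$ (b) and (c) $\iff$ (d)

where

(a) $\forall x, y \in B.\ d_A(f_* x, f_* y) = d_B(x,y)$

(b) $f^* f_* = \mathrm{id}_B$

(c) $\forall x, y \in A.\ d_B(f^* x, f^* y) = d_A(x,y)$

(d) $f_* f^* = \mathrm{id}_A$

Definition 3.17. A map g from a distance space A to a distance space B is an
embedding if it preserves the distance, i.e. satisfies $d_A(x,y) = d_B(gx, gy)$ for all $x, y \in A$. An
adjoint of an embedding is called a projection.

An adjunction $p^* : A \rightleftarrows B : e_*$ of a left projection and right embedding, as in
Prop. 3.16(a-b), is called a reflection. An adjunction $e^* : A \rightleftarrows B : p_*$ of a left
embedding and right projection, as in Prop. 3.16(c-d), is called a coreflection.

Definition 3.18. A nucleus of the adjunction $f^* : A \rightleftarrows B : f_*$ consists of a distance
space $\downharpoonleft f \downharpoonright$ together with

- embeddings $A \xhookleftarrow{\ e_*\ } \downharpoonleft f \downharpoonright \xhookrightarrow{\ e^*\ } B$

- projections $A \overset{p^*}{\twoheadrightarrow} \downharpoonleft f \downharpoonright \overset{p_*}{\twoheadleftarrow} B$

such that $f^* = e^* p^*$ and $f_* = e_* p_*$.

Proposition 3.19. Any adjunction factors through its nucleus by a reflection followed by
a coreflection. The nucleus of the adjunction $f^* : A \rightleftarrows B : f_*$ is in the form

$$ \downharpoonleft f \downharpoonright = \{ \langle x, y \rangle \in A \times B \mid f^* x = y \ \wedge\ x = f_* y \} \tag{15} $$

and the factoring is

[Diagram: factorization of $f^*$ and $f_*$ through the nucleus $\downharpoonleft f \downharpoonright$]

Any right adjoint factors through the nucleus by a right projection followed by a right
embedding, and any left adjoint factors through the nucleus by a left projection followed
by a left embedding. This factorization is unique up to isomorphism.

Proof. For any adjunction $f^* : A \rightleftarrows B : f_*$, the distance spaces

$$ \downharpoonleft f \downharpoonright_A = \{ x \in A \mid f_* f^* x = x \} \qquad\qquad \downharpoonleft f \downharpoonright_B = \{ y \in B \mid f^* f_* y = y \} $$

are easily seen to be isomorphic with the nucleus. The factorisation is thus

[Diagram: factorization through $\downharpoonleft f \downharpoonright_A \cong \downharpoonleft f \downharpoonright_B$] □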


3.7 Cones and cuts 

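The cone extensions are the contractions $\Delta^\#$ and $\nabla_\#$

[Diagram: the extension $\Delta^\# : \overrightarrow{A} \to \overleftarrow{A}$ of $\Delta$ along $\nabla$, and the extension $\nabla_\# : \overleftarrow{A} \to \overrightarrow{A}$ of $\nabla$ along $\Delta$]

induced by the universal properties of the Yoneda embeddings $\nabla$ and $\Delta$, as per Prop. 3.9.
Since $\Delta^\#$ thus preserves suprema, and $\nabla_\#$ preserves infima, Prop. 3.12 implies that each
of them is an adjoint, and it is not hard to see that they are adjoint to each other, i.e.
$\Delta^\# : \overrightarrow{A} \rightleftarrows \overleftarrow{A} : \nabla_\#$.

Proposition 3.20. For every $\varrho \in \overrightarrow{A}$ and every $\lambda \in \overleftarrow{A}$ holds

$$ \varrho \rightsquigarrow \nabla_\#\Delta^\#\varrho \qquad\text{and}\qquad \nabla_\#\Delta^\#\varrho \rightsquigarrow \varrho \iff \exists \lambda.\ \varrho = \nabla_\#\lambda $$

$$ \Delta^\#\nabla_\#\lambda \rightsquigarrow \lambda \qquad\text{and}\qquad \lambda \rightsquigarrow \Delta^\#\nabla_\#\lambda \iff \exists \varrho.\ \lambda = \Delta^\#\varrho $$

The transpositions make the following subspaces isomorphic

$$ \downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#} = \{ \varrho \in \overrightarrow{A} \mid \varrho = \nabla_\#\Delta^\#\varrho \} $$

$$ \downharpoonleft \overleftarrow{A} \downharpoonright_{\Delta^\#\nabla_\#} = \{ \lambda \in \overleftarrow{A} \mid \lambda = \Delta^\#\nabla_\#\lambda \} $$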

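Proof. Unfolding the definitions of $\nabla_\#$ and $\Delta^\#$ gives

$$ a(\nabla_\#\Delta^\#\varrho) = \bigvee_{u \in A} \Big( \bigvee_{x \in A} x\varrho \multimap d(x,u) \Big) \multimap d(a,u) $$

which shows that the first claim follows from the fact that for every $u \in A$ holds

$$ \bigvee_{x \in A} x\varrho \multimap d(x,u) \ \geq\ a\varrho \multimap d_A(a,u) $$

$$ a\varrho + \Big( \bigvee_{x \in A} x\varrho \multimap d(x,u) \Big) \ \geq\ d_A(a,u) $$

$$ a\varrho \ \geq\ \Big( \bigvee_{x \in A} x\varrho \multimap d(x,u) \Big) \multimap d_A(a,u) $$

where each line follows from the previous one by (3). □

Definition 3.21. The cones in a distance space A are the sequences in
$\downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#}$ and $\downharpoonleft \overleftarrow{A} \downharpoonright_{\Delta^\#\nabla_\#}$.
A cut in A is a pair of cones
$\gamma = \langle \overrightarrow{\gamma}, \overleftarrow{\gamma} \rangle \in \downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#} \times \downharpoonleft \overleftarrow{A} \downharpoonright_{\Delta^\#\nabla_\#}$
such that $\overrightarrow{\gamma} = \nabla_\#\overleftarrow{\gamma}$. The set of cuts is denoted by $\overleftrightarrow{A}$.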

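Lemma 3.22. There are bijections
$\downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#} \cong \overleftrightarrow{A} \cong \downharpoonleft \overleftarrow{A} \downharpoonright_{\Delta^\#\nabla_\#}$,
extending the isomorphism
$\downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#} \cong \downharpoonleft \overleftarrow{A} \downharpoonright_{\Delta^\#\nabla_\#}$
from Prop. 3.20.

Proposition 3.23. The set of cuts $\overleftrightarrow{A}$ with the distance defined by

$$ d_{\overleftrightarrow{A}}(\gamma, \varphi) = d_{\overrightarrow{A}}(\overrightarrow{\gamma}, \overrightarrow{\varphi}) = d_{\overleftarrow{A}}(\overleftarrow{\gamma}, \overleftarrow{\varphi}) $$

is a left and right complete distance space.

Notation. We often abuse notation and write

- $\overleftarrow{\varrho}$ for the associated cone $\Delta^\#\overrightarrow{\varrho}$, and

- $\overrightarrow{\lambda}$ for the associated cone $\nabla_\#\overleftarrow{\lambda}$.

Proof of Prop. 3.23. The $\overleftrightarrow{A}$-infima are constructed in $\overrightarrow{A}$, the $\overleftrightarrow{A}$-suprema in $\overleftarrow{A}$. To spell
this out, consider $\lambda : \overleftrightarrow{A} \to [0,\infty]$ and $\varrho : \overleftrightarrow{A}^o \to [0,\infty]$. Extend them along the
isomorphisms

$$ \downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#} \ \cong\ \overleftrightarrow{A} \ \cong\ \downharpoonleft \overleftarrow{A} \downharpoonright_{\Delta^\#\nabla_\#} $$

to get $\lambda : \downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#} \to [0,\infty]$ and $\varrho : \downharpoonleft \overleftarrow{A} \downharpoonright^o_{\Delta^\#\nabla_\#} \to [0,\infty]$.
The claim now boils down to showing that the inclusion $\downharpoonleft \overrightarrow{A} \downharpoonright_{\nabla_\#\Delta^\#} \hookrightarrow \overrightarrow{A}$ preserves infima,
whereas the inclusion $\downharpoonleft \overleftarrow{A} \downharpoonright_{\Delta^\#\nabla_\#} \hookrightarrow \overleftarrow{A}$ preserves the suprema. But this is immediate from
the next Lemma. □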

Lemma 3.24. The limits of the cut sequences 

7 : A° -> [0, oo] A : A -» [0, oo] 

~K : A° -> [0, oo] <F; A -> [0, oo] 

can be computed as follows 

a(ur) = f\ a~l +Tr (n A)a = A(Va) 

a(uz) = (Aa)# (n = A ^ + 

Corollary 3.25. A distance space A has all suprema if and only if it has all infima. 

Dedekind-MacNeille completion is a special case. If A is a poset, viewed as the
distance space WA, then $\overleftrightarrow{WA}$ is the Dedekind-MacNeille completion of A. The above
construction extends the Dedekind-MacNeille completion of posets [14] to distance
spaces, in the sense that it satisfies the same universal property, spelled out in [1].
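
For the poset special case, the cuts can be enumerated directly. The sketch below is the classical order-theoretic construction for a finite poset, not the general distance-space construction of this section: a cut is a pair (L, U) where U is the set of upper bounds of L and L is the set of lower bounds of U. The function name `macneille_cuts` is an assumption made for this example.

```python
from itertools import combinations


def macneille_cuts(elements, leq):
    """Enumerate the Dedekind-MacNeille cuts (L, U) of a finite poset.

    elements: list of poset elements; leq(x, y): the order relation.
    """

    def upper(subset):
        return frozenset(u for u in elements if all(leq(s, u) for s in subset))

    def lower(subset):
        return frozenset(l for l in elements if all(leq(l, s) for s in subset))

    cuts = set()
    # Every cut arises as (lower(upper(S)), upper(S)) for some subset S.
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            up = upper(subset)
            cuts.add((lower(up), up))
    return cuts
```

A two-element antichain yields four cuts (a new bottom, the two original elements, and a new top), whereas a finite chain yields one cut per element, since it is already complete.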



4 Distance matrices 
4.1 Definitions 

Definition 4.1. A distance matrix $\Phi$ from distance space A to distance space B is a
sequence $\Phi : A^o \times B \to [0,\infty]$. We denote it by $\Phi : A \looparrowright B$, and the value of $\Phi$ at $x \in A$
and $y \in B$ is written $x\Phi y$. The matrix composition of $\Phi : A \looparrowright B$ and $\Psi : B \looparrowright C$ is
defined

$$ x(\Phi ; \Psi)z = \bigwedge_{y \in B} x\Phi y + y\Psi z $$

With this composition and the identities $\mathrm{Id}_A : A \looparrowright A$ where $x(\mathrm{Id}_A)x' = d_A(x, x')$,
distance spaces and distance space matrices form the category Matr.

Remark. Note that the defining condition $d_A(u,x) + d_B(y,v) \geq d_{[0,\infty]}(x\Phi y, u\Phi v)$, which
says that $\Phi$ is a contraction $A^o \times B \to [0,\infty]$, can be equivalently written

$$ d_A(u,x) + x\Phi y + d_B(y,v) \geq u\Phi v \tag{16} $$
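
Matrix composition here is a minimum over sums (a min-plus product) rather than a sum over products. The sketch below stores a finite distance matrix as a dictionary keyed by index pairs and checks, on a made-up example satisfying (16), that composing with the identity matrices changes nothing. The names `compose` and `d`, and the sample spaces, are assumptions for illustration.

```python
def compose(Phi, Psi, A, B, C):
    """Min-plus composition: x(Phi;Psi)z = min over y in B of xPhiy + yPsiz."""
    return {(x, z): min(Phi[x, y] + Psi[y, z] for y in B) for x in A for z in C}


def d(x, y):
    """Asymmetric toy distance on numbers: up costs 2 per step, down costs 1."""
    return 2 * (y - x) if y >= x else x - y


A = [0, 1]
B = [0, 1, 2]

# Identity matrices are given by the metrics of the two spaces.
Id_A = {(x, x2): d(x, x2) for x in A for x2 in A}
Id_B = {(y, y2): d(y, y2) for y in B for y2 in B}

# A sample distance matrix from A to B; it satisfies (16) by the triangle law.
Phi = {(x, y): d(x, y) + 1 for x in A for y in B}
```

With these definitions, `compose(Id_A, Phi, A, A, B)` and `compose(Phi, Id_B, A, B, B)` both reproduce `Phi`, which is the unit law of the category Matr on this example.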

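Definition 4.2. Transposing the indices yields the transposed matrix:

$$ \Phi : A \looparrowright B : x\Phi y $$
$$ \Phi^o : B^o \looparrowright A^o : y\Phi^o x $$

The dual $\Phi^\ddagger : B \looparrowright A$ of a matrix $\Phi : A \looparrowright B$ has the entries

$$ \Phi : A \looparrowright B : x\Phi y $$
$$ \Phi^\ddagger : B \looparrowright A : y\Phi^\ddagger x = \bigvee_{u \in A,\ v \in B} u\Phi v \multimap \big( d_A(u,x) + d_B(y,v) \big) $$

A matrix $\Phi : A \looparrowright B$ where $\Phi^{\ddagger\ddagger} = \Phi$ is called a suspension.

Remarks. The transposition is obviously an involutive operation, i.e. $\Phi^{oo} = \Phi$. It is
easy to derive from Prop. 3.20 that $d_\Phi(x,y) \geq d_{\Phi^{\ddagger\ddagger}}(x,y)$ holds for all $x \in A$ and $y \in B$,
and that $\Phi = \Phi^{\ddagger\ddagger}$ holds if and only if there is some $\Psi : B \looparrowright A$ such that $\Phi = \Psi^\ddagger$. Since
$\Phi \rightsquigarrow \Psi \implies \Psi^\ddagger \rightsquigarrow \Phi^\ddagger$, it follows that $\Phi \rightsquigarrow \Phi^{\ddagger\ddagger}$ implies $\Phi^\ddagger = \Phi^{\ddagger\ddagger\ddagger}$.

Proposition 4.3. $\Phi : A \looparrowright B$ and $\Phi^\ddagger : B \looparrowright A$ satisfy $\Phi ; \Phi^\ddagger \rightsquigarrow \mathrm{Id}_A$ and $\Phi^\ddagger ; \Phi \rightsquigarrow \mathrm{Id}_B$.

Proof. The condition $\Phi ; \Phi^\ddagger \rightsquigarrow \mathrm{Id}_A$ is proven as follows, each line implying the next:

$$ \bigvee_{u \in A,\ v \in B} u\Phi v \multimap \big( d_A(u,x') + d_B(y,v) \big) \ \geq\ x\Phi y \multimap d_A(x,x') $$

$$ x\Phi y + \bigvee_{u \in A,\ v \in B} u\Phi v \multimap \big( d_A(u,x') + d_B(y,v) \big) \ \geq\ d_A(x,x') $$

$$ x\Phi y + y\Phi^\ddagger x' \ \geq\ d_A(x,x') $$

The second condition is proven analogously.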
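
The dual matrix and the two inequalities of Proposition 4.3 can be computed for a finite example. The sketch below assumes the entry formula of Definition 4.2, namely the supremum over u and v of the truncated difference between d_A(u, x) + d_B(y, v) and the entry at (u, v). It reads the two conditions of the proposition as the pointwise inequalities derived in the proof. The sample spaces and the names `dual` and `lollipop` are illustrative.

```python
def lollipop(x, y):
    """Truncated subtraction: max(y - x, 0)."""
    return 0 if y <= x else y - x


def d(x, y):
    """Asymmetric toy distance on numbers: up costs 2 per step, down costs 1."""
    return 2 * (y - x) if y >= x else x - y


A = [0, 1]
B = [0, 1, 2]
Phi = {(x, y): d(x, y) + 1 for x in A for y in B}  # sample distance matrix


def dual(Phi):
    """Entries of the dual matrix from B to A, following Definition 4.2."""
    return {
        (y, x): max(lollipop(Phi[u, v], d(u, x) + d(y, v)) for u in A for v in B)
        for y in B
        for x in A
    }


Phi_dual = dual(Phi)

# Composites, computed as minima of sums over the middle index.
Phi_then_dual = {(x, x2): min(Phi[x, y] + Phi_dual[y, x2] for y in B)
                 for x in A for x2 in A}
dual_then_Phi = {(y, y2): min(Phi_dual[y, x] + Phi[x, y2] for x in A)
                 for y in B for y2 in B}
```

On this example every entry of `Phi_then_dual` is at least the corresponding distance in A, and every entry of `dual_then_Phi` is at least the corresponding distance in B.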



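Definition 4.4. A matrix $\Phi : A \looparrowright B$ is an embedding if $\Phi ; \Phi^\ddagger = \mathrm{Id}_A$; and a projection if
$\Phi^\ddagger ; \Phi = \mathrm{Id}_B$.

Definition 4.5. A decomposition of a matrix $\Phi : A \looparrowright B$ consists of a distance space
D, with

- projection matrix $P : A \looparrowright D$, i.e. $d_D(d,d') = \bigwedge_{x \in A} dP^\ddagger x + xPd'$,

- embedding matrix $E : D \looparrowright B$, i.e. $d_D(d,d') = \bigwedge_{y \in B} dEy + yE^\ddagger d'$,

such that $\Phi = P ; E$, i.e. $x\Phi y = \bigwedge_{d \in D} xPd + dEy$.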



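Matrices as adjunctions. A matrix $\Phi : A \looparrowright B$ can be equivalently presented as either
of the two contractions $\Phi_\bullet$ and $\Phi^\bullet$, which extend to $\Phi_*$ and $\Phi^*$ using Prop. 3.9:

$$ \Phi : A^o \times B \to [0,\infty] $$
$$ \Phi^\bullet : A \to \overleftarrow{B} \qquad\qquad \Phi_\bullet : B \to \overrightarrow{A} $$
$$ \Phi^* : \overrightarrow{A} \to \overleftarrow{B} \qquad\qquad \Phi_* : \overleftarrow{B} \to \overrightarrow{A} $$
$$ (\Phi^*\varrho)b = \bigvee_{x \in A} x\varrho \multimap x\Phi b \qquad\qquad (\Phi_*\lambda)a = \bigvee_{y \in B} \lambda y \multimap a\Phi y $$

Both extensions, and their nucleus, are summarized in the following diagram

[Diagram (17): the extensions $\Phi^*$ and $\Phi_*$ and their nucleus]

The adjunction $\Phi^* : \overrightarrow{A} \rightleftarrows \overleftarrow{B} : \Phi_*$ means that

$$ \bigvee_{y \in B} \lambda y \multimap (\Phi^*\varrho)y \ =\ \bigvee_{x \in A} x\varrho \multimap x(\Phi_*\lambda) \tag{18} $$

holds. The other way around, it can be shown that any adjunction between $\overrightarrow{A}$ and $\overleftarrow{B}$ is
completely determined by the induced matrix from A to B.

Proposition 4.6. The matrices $\Phi \in$ Matr(A, B) are in a bijective correspondence with
the adjunctions $\Phi^* : \overrightarrow{A} \rightleftarrows \overleftarrow{B} : \Phi_*$.

Lemma 4.7. $d_{\overleftarrow{B}}(\Phi^*\nabla x, \Delta y) = x\Phi y = d_{\overrightarrow{A}}(\nabla x, \Phi_*\Delta y)$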
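
Lemma 4.7 says that the extensions recover the original matrix entries on representable sequences. The sketch below checks this on a made-up finite example. It assumes the extension formulas displayed in this subsection and uses the Yoneda lemma to read each distance as a value of the extended sequence, so that the claim becomes: the upper extension of the representable at x, evaluated at y, equals the entry at (x, y), and likewise for the lower extension. The names `phi_upper` and `phi_lower` are illustrative.

```python
def lollipop(x, y):
    """Truncated subtraction: max(y - x, 0)."""
    return 0 if y <= x else y - x


def d(x, y):
    """Asymmetric toy distance on numbers: up costs 2 per step, down costs 1."""
    return 2 * (y - x) if y >= x else x - y


A = [0, 1]
B = [0, 1, 2]
Phi = {(x, y): d(x, y) + 1 for x in A for y in B}  # satisfies condition (16)


def phi_upper(rho):
    """Extension sending a right sequence on A to a left sequence on B."""
    return {b: max(lollipop(rho[x], Phi[x, b]) for x in A) for b in B}


def phi_lower(lam):
    """Extension sending a left sequence on B to a right sequence on A."""
    return {a: max(lollipop(lam[y], Phi[a, y]) for y in B) for a in A}


def nabla(x):
    """Right representable sequence on A: u maps to d(u, x)."""
    return {u: d(u, x) for u in A}


def delta(y):
    """Left representable sequence on B: v maps to d(y, v)."""
    return {v: d(y, v) for v in B}
```

For every x in A and y in B of this example, `phi_upper(nabla(x))[y]` and `phi_lower(delta(y))[x]` both equal `Phi[x, y]`.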



4.2 Decomposition through nucleus 

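Proposition 4.8. For every $\alpha \in \overrightarrow{A}$ and every $\beta \in \overleftarrow{B}$ holds

$$ \alpha \rightsquigarrow \Phi_*\Phi^*\alpha \qquad\text{and}\qquad \Phi_*\Phi^*\alpha \rightsquigarrow \alpha \iff \exists \beta \in \overleftarrow{B}.\ \alpha = \Phi_*\beta $$

$$ \Phi^*\Phi_*\beta \rightsquigarrow \beta \qquad\text{and}\qquad \beta \rightsquigarrow \Phi^*\Phi_*\beta \iff \exists \alpha \in \overrightarrow{A}.\ \beta = \Phi^*\alpha $$

The adjunction $\Phi^* : \overrightarrow{A} \rightleftarrows \overleftarrow{B} : \Phi_*$ induces the isomorphisms between the following
distance spaces

$$ \downharpoonleft \Phi \downharpoonright_A = \{ \alpha \in \overrightarrow{A} \mid \alpha = \Phi_*\Phi^*\alpha \} \qquad\qquad \downharpoonleft \Phi \downharpoonright_B = \{ \beta \in \overleftarrow{B} \mid \beta = \Phi^*\Phi_*\beta \} $$

$$ \downharpoonleft \Phi \downharpoonright = \{ \gamma = \langle \overrightarrow{\gamma}, \overleftarrow{\gamma} \rangle \in \overrightarrow{A} \times \overleftarrow{B} \mid \overrightarrow{\gamma} = \Phi_*\overleftarrow{\gamma} \ \wedge\ \Phi^*\overrightarrow{\gamma} = \overleftarrow{\gamma} \} $$

with the metric

$$ d_{\downharpoonleft \Phi \downharpoonright}(\gamma, \varphi) = d_{\overrightarrow{A}}(\overrightarrow{\gamma}, \overrightarrow{\varphi}) = d_{\overleftarrow{B}}(\overleftarrow{\gamma}, \overleftarrow{\varphi}) $$

Definition 4.9. $\downharpoonleft \Phi \downharpoonright$ is called the nucleus of the matrix $\Phi$. Its elements are the $\Phi$-cuts.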
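
A cut of the matrix can be generated from any right sequence by applying the two extensions in turn. The sketch below assumes the extension formulas of Sec. 4.1 (suprema of truncated differences) and a made-up finite matrix. It builds the pair consisting of the closure of a right sequence and its image, and relies on the standard fact that, for a Galois connection of this kind, applying the extensions three times agrees with applying one. The helper names are illustrative.

```python
def lollipop(x, y):
    """Truncated subtraction: max(y - x, 0)."""
    return 0 if y <= x else y - x


A = [0, 1]
B = [0, 1, 2]
# Hypothetical sample matrix from A to B, stored by index pairs.
Phi = {(0, 0): 1, (0, 1): 3, (0, 2): 5, (1, 0): 2, (1, 1): 1, (1, 2): 3}


def phi_upper(rho):
    """Send a right sequence on A to a left sequence on B."""
    return {b: max(lollipop(rho[x], Phi[x, b]) for x in A) for b in B}


def phi_lower(lam):
    """Send a left sequence on B to a right sequence on A."""
    return {a: max(lollipop(lam[y], Phi[a, y]) for y in B) for a in A}


def generated_cut(rho):
    """Pair (alpha, beta) with alpha = phi_lower(beta) and beta = phi_upper(rho)."""
    beta = phi_upper(rho)
    alpha = phi_lower(beta)
    return alpha, beta
```

For a sample right sequence such as `{0: 3, 1: 0}`, the pair returned by `generated_cut` satisfies both defining equations of a cut: `alpha == phi_lower(beta)` by construction, and `phi_upper(alpha) == beta` by the closure property.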

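Theorem 4.10. The nucleus of the adjunction $\Phi^* : \overrightarrow{A} \rightleftarrows \overleftarrow{B} : \Phi_*$ induces the
decomposition of the matrix $\Phi : A \looparrowright B$ into

- the projection $P^* : A \looparrowright \downharpoonleft \Phi \downharpoonright$ with $xP^*\langle \alpha, \beta \rangle = x\alpha$, and

- the embedding $E^* : \downharpoonleft \Phi \downharpoonright \looparrowright B$ with $\langle \alpha, \beta \rangle E^* y = \beta y$

where $\langle \alpha, \beta \rangle \in \downharpoonleft \Phi \downharpoonright$ is an arbitrary $\Phi$-cut, i.e. $\alpha = \Phi_*\beta$ and $\Phi^*\alpha = \beta$.

Proof (sketch). We prove that $\Phi = P^* ; E^*$ as follows:

$$ x(P^* ; E^*)y = \bigwedge_{\alpha} xP^*\langle \alpha, \Phi^*\alpha \rangle + \langle \alpha, \Phi^*\alpha \rangle E^* y $$
$$ \leq x(\nabla x) + (\Phi^*\nabla x)y $$
$$ = d_A(x,x) + d_{\overleftarrow{B}}(\Phi^*\nabla x, \Delta y) $$
$$ = x\Phi y $$

using Lemma 4.7 at the last step. The facts that $P^*$ is a projection and $E^*$ is an embedding
matrix are proved using the following lemma, which says that $\downharpoonleft \Phi \downharpoonright$ is $\coprod$-generated
by A and $\prod$-generated by B. □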



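Lemma 4.11. The $\downharpoonleft \Phi \downharpoonright$-infima are computed in $\overrightarrow{A}$, whereas its suprema are computed
in $\overleftarrow{B}$. To state this precisely, consider $\lambda : \downharpoonleft \Phi \downharpoonright \to [0,\infty]$ and $\varrho : \downharpoonleft \Phi \downharpoonright^o \to [0,\infty]$.
Extend them along the isomorphisms $\downharpoonleft \Phi \downharpoonright_A \cong \downharpoonleft \Phi \downharpoonright \cong \downharpoonleft \Phi \downharpoonright_B$ to get $\lambda : \downharpoonleft \Phi \downharpoonright_A \to [0,\infty]$
and $\varrho : \downharpoonleft \Phi \downharpoonright_B^o \to [0,\infty]$. Then the limits $\prod \lambda$ and $\coprod \varrho$
are constructed in $\overrightarrow{A}$ and $\overleftarrow{B}$, because $\downharpoonleft \Phi \downharpoonright_A \hookrightarrow \overrightarrow{A}$ preserves the infima, whereas
$\downharpoonleft \Phi \downharpoonright_B \hookrightarrow \overleftarrow{B}$ preserves the suprema.

Corollary 4.12. The monotone maps $A \xrightarrow{\nabla} \overrightarrow{A} \twoheadrightarrow \downharpoonleft \Phi \downharpoonright \twoheadleftarrow \overleftarrow{B} \xleftarrow{\Delta} B$

- preserve any infima that exist in A, and any suprema that exist in B,

- generate $\downharpoonleft \Phi \downharpoonright$ by the suprema from A and by the infima from B, in the sense that for
  any $\langle \alpha, \beta \rangle \in \downharpoonleft \Phi \downharpoonright$ holds

$$ \coprod_{\alpha} \nabla = \langle \alpha, \beta \rangle = \prod_{\beta} \Delta $$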



5 Bicompletion 

Any distance space morphism $f : A \to B$ induces two matrices, $\Omega f : A \looparrowright B$ and
$\mho f : B \looparrowright A$ with

$$ x\,\Omega f\,y = d_B(fx, y) \qquad\qquad y\,\mho f\,x = d_B(y, fx) $$

Lemma 5.1. For every matrix $\Omega f : A \looparrowright B$ induced by a distance space morphism
$f : A \to B$ holds $(\Omega f)^\ddagger = \mho f$.

Proof. Since $y\,\mho f\,x = d_B(y, fx)$ by definition, the claim boils down to $y(\Omega f)^\ddagger x =
d_B(y, fx)$, which can be proved as follows

$$ y(\Omega f)^\ddagger x = \bigvee_{u \in A,\ v \in B} d_B(fu, v) \multimap \big( d_B(y,v) + d_A(u,x) \big) $$
$$ \geq d_B(fx, fx) \multimap \big( d_B(y, fx) + d_A(x,x) \big) = d_B(y, fx) $$

□



5.1 Nucleus as a completion 

Lemma 5.2. If the distance space B is complete, then for any matrix $\Phi : A \looparrowright B$ there
is a distance space morphism $f : A \to B$ such that $\Phi = \Omega f$.

Corollary 5.3. If both A and B are complete, then any matrix $\Phi : A \looparrowright B$ corresponds
to an adjunction $\Phi^* : A \rightleftarrows B : \Phi_*$ such that $\Phi = \Omega\Phi^* = \mho\Phi_*$.



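Definition 5.4. A distance matrix homomorphism $h : \Phi \to \Gamma$ where $\Phi : A \looparrowright B$ and
$\Gamma : C \looparrowright D$ is a pair of contractions $h = \langle h_0 : A \to C,\ h_1 : B \to D \rangle$ such that

- $\Omega h_0 ; \Gamma = \Phi ; \Omega h_1$,

- $h_0$ preserves any suprema that may exist in A,

- $h_1$ preserves any infima that may exist in B.

Let MMat denote the category of distance space matrices and matrix morphisms.

Definition 5.5. A matrix $\Phi : A \looparrowright B$ is complete if A has suprema and B infima¹, and
$\Phi : A^o \times B \to [0,\infty]$ preserves the infima. Let CMat denote the category of complete
matrices and matrix homomorphisms.

Proposition 5.6. $\mathrm{Id}_{\downharpoonleft \Phi \downharpoonright} : \downharpoonleft \Phi \downharpoonright \looparrowright \downharpoonleft \Phi \downharpoonright$ is the completion of $\Phi : A \looparrowright B$. In other words,
the functor $\downharpoonleft - \downharpoonright$ : MMat $\to$ CMat is left adjoint to the full inclusion CMat $\hookrightarrow$ MMat.
The unit of the adjunction $\eta = \langle \eta_0, \eta_1 \rangle : \Phi \to \mathrm{Id}_{\downharpoonleft \Phi \downharpoonright}$ consists of

$$ \eta_0 : A \xrightarrow{\nabla} \overrightarrow{A} \twoheadrightarrow \downharpoonleft \Phi \downharpoonright \qquad\text{and}\qquad \eta_1 : B \xrightarrow{\Delta} \overleftarrow{B} \twoheadrightarrow \downharpoonleft \Phi \downharpoonright $$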

6 Summary and discussion 

Given an arbitrary distance matrix $\Phi : A \looparrowright B$, we have constructed the completion
$\mathrm{Id}_{\downharpoonleft \Phi \downharpoonright} : \downharpoonleft \Phi \downharpoonright \looparrowright \downharpoonleft \Phi \downharpoonright$ such that

- $A \to \downharpoonleft \Phi \downharpoonright$ is $\coprod$-generating and $\prod$-preserving,

- $B \to \downharpoonleft \Phi \downharpoonright$ is $\prod$-generating and $\coprod$-preserving.

In terms of the motivating example of program transformations, and of the task of
conjoining the algorithmic knowledge about systems and about attacks, every $\Phi$-cut is thus
a supremum of the system specifications in A, and an infimum of the attack
specifications in B. Moreover, the suprema of $\Phi$-cuts can be computed in $\overleftarrow{B}$, whereas the infima
can be computed in $\overrightarrow{A}$. While the suprema² capture composite systems validating some
composite properties, the infima describe composite attacks where the invalidated
properties add up.

But what has been achieved by providing this very abstract account? It turns out 
that the actual completions provide fairly concrete information. There is no space to 
illustrate this, but we sketch a high level view. The prior knowledge, represented by 
the distance spaces A and B, is updated by the empiric data, represented by the matrix
$\Phi : A \looparrowright B$. In the completion $\downharpoonleft \Phi \downharpoonright$, the empiric relations of the a's and b's are expressed as
distances. Following 112 11131 Ch. 4], the task of explaining these empiric links can then 
be viewed as the task of finding short programs p with p(a) = b. After such completions, 
some distances previously recorded in A and B may decrease, since some programs may
be more closely related a posteriori than a priori.

¹ By Corollary 3.25, both A and B are thus complete.

² Not unlike colimits of software specifications [20, 19].



The obvious task for future work is to refine the concrete applications of the pre- 
sented construction. This is to some extent covered in the full paper, which is in prepa- 
ration. The further work on quantifying the hardness of program derivations, and of 
program transformations, branches in many directions. Distances arise naturally in this 
framework, as described already in [16, Sec. 4.2]. In a different direction, it seems
interesting to study the bicompletions in other categorical frameworks, in particular where
the dualities fail in a significant way, as demonstrated a long time ago [6].